Aim

This script shows how to estimate spillover from single-metal spots on an agarose-coated slide. It illustrates how the helper function 'estimate_sm_from_imc_txtfol' estimates a spillover matrix, in a single function call, from a folder containing all acquisitions in .txt format.

library(CATALYST)
library(data.table)
library(ggplot2)
library(flowCore)
library(dplyr)
library(dtplyr)
library(stringr)
library(ggpmisc)
source('spillover_imc_helpers.R')

Set up the configuration variables

# a folder containing a complete single stain acquisition
fol_ss = '../data/Figure_S5/Spillover_Matrix_2'

Example: estimate the spillover matrix from a folder of single stains, with the single-stain metals parsed from file names of the form xxx_x_metal_x.txt (e.g. Dy161 1-1000_8_Dy161_8.txt).

#' estimate_sm_from_imc_txtfol
#' Estimates spillover directly from a folder containing IMC .txt files
#'
#' @param fol_ss folder containing .txt acquisitions of IMC single stains
#' @param ssmetals_from_fn logical, are the single stains correctly named xxx_x_metal_x.txt? (E.g. Dy161 1-1000_8_Dy161_8.txt)
#' @param ssmass Vector of masses of the single stains used. Required if ssmetals_from_fn is FALSE
#' @param fn2ssmetal Optional: a named vector mapping the filenames to the single stain metal used (e.g. if it cannot be parsed from the filename)
#' @param remove_incorrect_bc Remove barcodes not matching the filename single stain annotation (requires either ssmetals_from_fn=TRUE or fn2ssmetal)
#' @param minevents Minimal number of events (after debarcoding) that must be present in a single stain for a spillover estimation to be performed
#' @param bin_n_pixels Optional: integer, bin n consecutive pixels. Can be used if the intensities per pixel are too low (e.g. <200 counts)
#' @param ... Optional parameters will be passed to CATALYST::computeSpillmat 
res = estimate_sm_from_imc_txtfol(fol_ss, ssmetals_from_fn = TRUE)
## Debarcoding data...
##  o ordering
##  o classifying events
## Normalizing...
## Computing deltas...
## Computing counts and yields...
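
The naming convention xxx_x_metal_x.txt can be illustrated with a small base R sketch. The function 'parse_ss_metal' is a hypothetical helper written for this illustration and is not part of spillover_imc_helpers.R; the second file name is made up, and the assumption that the metal is the second-to-last underscore-separated field is inferred only from the single example given above. Whether fn2ssmetal expects exactly this label format is also an assumption.

```r
# Hypothetical helper (not part of spillover_imc_helpers.R): extract the metal
# label from file names of the form "<description>_<n>_<metal>_<n>.txt".
parse_ss_metal <- function(filenames) {
  stem <- sub("\\.txt$", "", basename(filenames))
  parts <- strsplit(stem, "_", fixed = TRUE)
  # Assumption: the metal is the second-to-last underscore-separated field.
  vapply(parts, function(p) {
    if (length(p) < 2) NA_character_ else p[length(p) - 1]
  }, character(1))
}

# The first name is the example from the text; the second is invented.
fns <- c("Dy161 1-1000_8_Dy161_8.txt", "Yb176 1-1000_23_Yb176_23.txt")
metals <- parse_ss_metal(fns)

# Numeric mass: strip the leading element symbol from the metal label.
masses <- as.integer(sub("^[A-Za-z]+", "", metals))

# A named vector mapping file names to metals, in the spirit of 'fn2ssmetal'.
fn2ssmetal <- setNames(metals, fns)
print(fn2ssmetal)
```

Files whose names do not contain at least two underscore-separated fields yield NA, which makes misnamed acquisitions easy to spot before running the estimation.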

Plot the estimated spillover matrix. (The first argument, 1:10000, only serves to satisfy the requirement of the plotSpillmat function that single-stain masses be provided.)

CATALYST::plotSpillmat(1:10000, res[['sm']])
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`

Quality control: Distribution of medians per file

plot_file_medians(res[['data']])
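
The idea behind this check can be sketched on toy data: in each single-stain file, the channel with the highest median signal should be the stained metal. The sketch below does not reproduce 'plot_file_medians'. The column names (file, channel, counts), the file names, and the simulated intensities are all assumptions made for illustration, and the actual structure of res[['data']] may differ.

```r
# Toy long-format pixel table. Column names and values are illustrative
# assumptions, not the actual layout of res[['data']].
set.seed(1)
toy <- data.frame(
  file    = rep(c("ss_Dy161.txt", "ss_Dy162.txt"), each = 200),
  channel = rep(rep(c("Dy161", "Dy162"), each = 100), times = 2),
  counts  = c(rpois(100, 500), rpois(100, 5),   # file 1: strong Dy161 signal
              rpois(100, 4),   rpois(100, 450)) # file 2: strong Dy162 signal
)

# Median counts per file and channel.
file_medians <- aggregate(counts ~ file + channel, data = toy, FUN = median)

# For each file, keep the channel with the highest median.
top <- do.call(rbind, lapply(split(file_medians, file_medians$file),
                             function(d) d[which.max(d$counts), ]))
print(file_medians)
print(top)
```

A file whose top channel does not match its annotated metal, or whose top median is very low, would be a candidate for exclusion or for pixel binning.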

To write the spillover matrix to a file:

#write.csv(res[['sm']],file = 'path/sm.csv')
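
When the matrix is read back later, the row and column labels need to survive the round trip. The following is a minimal sketch using a toy 2 x 2 matrix in place of res[['sm']]; the channel names and spillover values are invented for illustration, and a temporary file stands in for the real output path.

```r
# Toy spillover matrix standing in for res[['sm']]; names and values are
# illustrative only.
sm <- matrix(c(1,    0.02,
               0.01, 1),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("Dy161Di", "Dy162Di"),
                             c("Dy161Di", "Dy162Di")))

out_file <- tempfile(fileext = ".csv")
write.csv(sm, file = out_file)

# row.names = 1 restores the row labels from the first column;
# check.names = FALSE prevents R from altering the column names.
sm_read <- as.matrix(read.csv(out_file, row.names = 1, check.names = FALSE))

# Verify that values and labels are unchanged after the round trip.
stopifnot(isTRUE(all.equal(sm, sm_read)))
```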

Alternatively, the function can be called with a vector of single-stain masses if the metals are not contained in the file names.

ssmass = c(161, 162, 163, 164, 166, 167, 168, 170, 151, 153, 155, 156, 158, 160, 165, 113, 115, 175, 142, 143, 144, 145, 146, 148, 150, 141, 147, 149, 152, 154, 159, 169, 171, 172, 173, 174, 176)
res = estimate_sm_from_imc_txtfol(fol_ss, ssmetals_from_fn = FALSE, ssmass = ssmass)
## Debarcoding data...
##  o ordering
##  o classifying events
## Normalizing...
## Computing deltas...
## Computing counts and yields...
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 14.04.5 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/openblas-base/libblas.so.3
## LAPACK: /usr/lib/lapack/liblapack.so.3.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] stringi_1.1.5       ggpmisc_0.2.16      stringr_1.2.0      
## [4] dtplyr_0.0.2        dplyr_0.7.4         flowCore_1.42.3    
## [7] ggplot2_2.2.1       data.table_1.10.4-1 CATALYST_1.1.5     
## 
## loaded via a namespace (and not attached):
##  [1] Biobase_2.36.2      httr_1.3.1          tidyr_0.7.1        
##  [4] viridisLite_0.2.0   jsonlite_1.5        splines_3.4.1      
##  [7] gtools_3.5.0        shiny_1.0.5         assertthat_0.2.0   
## [10] stats4_3.4.1        yaml_2.1.16         robustbase_0.92-7  
## [13] backports_1.1.1     lattice_0.20-35     quantreg_5.33      
## [16] glue_1.1.1          digest_0.6.15       RColorBrewer_1.1-2 
## [19] minqa_1.2.4         colorspace_1.3-2    sandwich_2.4-0     
## [22] httpuv_1.3.5        htmltools_0.3.6     Matrix_1.2-11      
## [25] plyr_1.8.4          pcaPP_1.9-72        pkgconfig_2.0.1    
## [28] SparseM_1.77        xtable_1.8-2        purrr_0.2.3        
## [31] corpcor_1.6.9       mvtnorm_1.0-6       scales_0.5.0       
## [34] lme4_1.1-14         MatrixModels_0.4-1  tibble_1.3.4       
## [37] mgcv_1.8-22         car_2.1-5           TH.data_1.0-8      
## [40] nnet_7.3-12         BiocGenerics_0.22.1 lazyeval_0.2.0     
## [43] pbkrtest_0.4-7      mime_0.5            survival_2.41-3    
## [46] magrittr_1.5        evaluate_0.10.1     nlme_3.1-131       
## [49] MASS_7.3-47         graph_1.54.0        tools_3.4.1        
## [52] matrixStats_0.53.0  multcomp_1.4-8      plotly_4.7.1       
## [55] munsell_0.4.3       cluster_2.0.6       plotrix_3.7        
## [58] bindrcpp_0.2        compiler_3.4.1      rlang_0.1.2        
## [61] grid_3.4.1          nloptr_1.0.4        drc_3.0-1          
## [64] htmlwidgets_1.0     crosstalk_1.0.0     labeling_0.3       
## [67] rmarkdown_1.6       gtable_0.2.0        codetools_0.2-15   
## [70] reshape2_1.4.3      rrcov_1.4-3         R6_2.2.2           
## [73] gridExtra_2.3       nnls_1.4            zoo_1.8-0          
## [76] knitr_1.17          bindr_0.1           rprojroot_1.2      
## [79] parallel_3.4.1      Rcpp_0.12.15        DEoptimR_1.0-8